Restructuring Databases for Knowledge Discovery by Consolidation and Link Formation

Authors

  • Henry G. Goldberg
  • Ted E. Senator
Abstract

Databases often inaccurately identify entities of interest. Two operations, consolidation and link formation, are proposed as essential components of KDD systems for certain applications; they complement the usual machine learning techniques that use similarity-based clustering to discover classifications. Consolidation relates identifiers present in a database to a set of real-world entities (RWEs) which are not uniquely identified in the database. Consolidation may also be viewed as a transformation of representation from the identifiers present in the original database to the RWEs. Link formation constructs structured relationships between consolidated RWEs through identifiers and events explicitly represented in the database. An operational knowledge discovery system which identifies potential money laundering in a database of large cash transactions implements consolidation and link formation. Consolidation and link formation are easily implemented as index creation in relational database management systems.
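
The two operations described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation, which the abstract describes as index creation in a relational database; it is a minimal Python example under stated assumptions. The transaction records, the field names (`txn`, `name`, `id_no`, `account`), and the matching rule (two records refer to the same RWE if they share an exact name or an exact ID number) are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transaction records; every field name and value is invented.
TRANSACTIONS = [
    {"txn": 1, "name": "J. Smith", "id_no": "A100", "account": "ACC-1"},
    {"txn": 2, "name": "John Smith", "id_no": "A100", "account": "ACC-2"},
    {"txn": 3, "name": "John Smith", "id_no": None, "account": "ACC-3"},
    {"txn": 4, "name": "M. Jones", "id_no": "B200", "account": "ACC-2"},
]


def consolidate(transactions, keys=("name", "id_no")):
    """Map each transaction to an RWE label using union-find.

    Assumed rule: records sharing an exact value for any field in `keys`
    are treated as the same real-world entity.
    """
    parent = {t["txn"]: t["txn"] for t in transactions}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (field, value) -> first transaction carrying it
    for t in transactions:
        for key in keys:
            value = t[key]
            if value is None:
                continue  # a missing identifier gives no evidence
            other = first_seen.setdefault((key, value), t["txn"])
            parent[find(t["txn"])] = find(other)
    return {t["txn"]: find(t["txn"]) for t in transactions}


def form_links(transactions, rwe_of, via="account"):
    """Link pairs of distinct RWEs that share a value of the `via` field."""
    users = defaultdict(set)  # account -> RWEs that used it
    for t in transactions:
        users[t[via]].add(rwe_of[t["txn"]])
    links = defaultdict(set)  # (rwe_a, rwe_b) -> shared accounts
    for value, rwes in users.items():
        for a, b in combinations(sorted(rwes), 2):
            links[(a, b)].add(value)
    return dict(links)


rwe_of = consolidate(TRANSACTIONS)
links = form_links(TRANSACTIONS, rwe_of)
```

Here transactions 1 to 3 are consolidated into one entity (1 and 2 share an ID number, 2 and 3 share a name), and that entity is linked to the entity behind transaction 4 through the shared account `ACC-2`. A production system would need a more tolerant matching rule than exact equality, which is outside the scope of this sketch.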

Related resources

Restructuring Transactional Data for Link Analysis in the FinCEN AI System

Due to the nature and costs of data collection, many real-world databases consist of large numbers of independent transactions. Finding evidence of structured groups of entities reflected in this data is a task aptly suited to Link Analysis. However, the databases usually must be restructured to allow effective search and analysis of the linkage structures hidden in the original transactions. Th...

A Review of Data Mining Applications in the Health System

Introduction: Extensive amounts of data stored in medical databases require the development of specialized tools for data access, data analysis, knowledge discovery, and effective use of the data. Data mining is one of the most important of these methods. The article sketches the data mining techniques in use and illustrates their applicability to medical diagnostic and prognostic problems. ...

Cross Border Mergers and Acquisitions by Indian firms-An Analysis of Pre and Post Merger performance

The corporate sector all over the world is restructuring its operations through different types of consolidation strategies, such as mergers and acquisitions, in order to face the challenges posed by the new pattern of globalization, which has led to greater integration of national and international markets. The intensity of cross-border operations recorded an unprecedented ...

Preprocessing and Integration of Data from Multiple Sources for Knowledge Discovery

The explosive growth in the generation and collection of data has generated an urgent need for a new generation of techniques and tools that can assist in transforming these data intelligently and automatically into useful knowledge. Knowledge discovery is an emerging multidisciplinary field that attempts to fulfill this need. Knowledge discovery is a large process that includes data selection,...

Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering

Although many studies have been conducted to improve clustering efficiency, most state-of-the-art schemes suffer from a lack of robustness and stability. This paper aims to propose an efficient approach to elicit prior knowledge, in terms of must-link and cannot-link constraints, from the estimated distribution of raw data in order to convert a blind clustering problem into a semi-supervised o...

Publication date: 1995